The genome sequence of a metallic wood-boring beetle, Agrilus cyanescens (Ratzeburg, 1837)

We present a genome assembly from an individual female Agrilus cyanescens (metallic wood-boring beetle; Arthropoda; Insecta; Coleoptera; Buprestidae). The genome sequence is 292.3 megabases in span. Most of the assembly is scaffolded into 10 chromosomal pseudomolecules, including the X sex chromosome. The mitochondrial genome has also been assembled and is 15.91 kilobases in length.


Background
Agrilus cyanescens Ratzeburg, 1837, is a metallic wood-boring beetle from the family Buprestidae, or jewel beetles. Like other species in the genus Agrilus, A. cyanescens can be distinguished from other buprestids by its paired toothed tarsal claws, a marginal groove in the abdominal sternites, and a first hind tarsomere that is longer than the second and third together (Duff & Schmidt, 2020). A. cyanescens can be distinguished from other members of its genus by a frons with a deep longitudinal furrow and a pronotum without keels near the hind angles. It measures 4.5-7 mm and is metallic blue in colouration (Duff & Schmidt, 2020; Hodge, 2010). This species also displays a form of sexual dimorphism that is unique among Agrilus species: the male A. cyanescens is characterised by a more deeply emarginated and robust prosternal lobe, as well as sharply protruding metacoxal plates (Hodge, 2010).
Agrilus cyanescens is distributed from southwestern China across to Turkey and through to Syria; it is also found west through to France, reaching western Iberia (Niehuis, 2004). A. cyanescens has also been recorded in Luxembourg, regionally in Italy, and northwards to Denmark, but is absent from Scandinavia; in Germany it is found throughout and is widespread in lower mountain areas (Niehuis, 2004). A. cyanescens is a recent arrival in the UK: it was first recorded in South Essex/Eastern London in 2008 and has also been recorded in Cambridgeshire and Hertfordshire (Jendek & Grebennikov, 2009). Other species of Agrilus were fairly widespread in the UK before 1980; since then, however, records show higher populations around the London/Essex area spreading southwards to the coast and Devon, with some records extending to eastern Wales and across to Norfolk (Alexander, 2003). The apparent reduction in recording sites for these species has been attributed to loss of host plant species and to reduction and fragmentation of habitat (Brown et al., 2015). Thus, although it is a new arrival to the UK, the range of A. cyanescens may also become limited for similar reasons. This highlights the importance of gathering species records and obtaining full barcode and genome records to help monitor populations of Agrilus in the UK.
Agrilus cyanescens resides in deciduous and lowland forests, occasionally at high altitudes where windbreak basins are present, and has also been recorded in areas with high honeysuckle density and occasionally on motorway verges (Niehuis, 2004). A. cyanescens is a polyphagous species, recorded as feeding on plants from the genera Quercus, Fagus, Fraxinus, Alnus, Betula, Acer, Ulmus, Castanea, Salix, Populus and Lonicera (Niehuis, 2004). Eggs of this species are laid singly in crevices in the bark of trees around 15-30 cm above soil level. The larvae bore gaps into the wood before pupation, creating a hook-shaped pupa in the sapwood layer (Niehuis, 2004). Pupal development lasts one year; adults are usually recorded in flight from June to July but have also been sighted from August to September (Niehuis, 2004).
The genome of Agrilus cyanescens was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for Agrilus cyanescens, based on one specimen collected from Wytham Woods, Oxfordshire. The publication of the first genus-wide DNA reference library of Holarctic Agrilus (Kelnarova et al., 2019) enabled the Darwin Tree of Life Project to successfully barcode match A. cyanescens to the barcodes available in this library.

Genome sequence report
The genome was sequenced from one female Agrilus cyanescens (Figure 1) collected from Wytham Woods, Oxfordshire, UK (51.77, -1.34). A total of 56-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 45 missing joins or mis-joins and removed 1 haplotypic duplication, reducing the scaffold number by 75.00% and increasing the scaffold N50 by 109.22%. The final assembly has a total length of 292.3 Mb in 10 sequence scaffolds with a scaffold N50 of 29.7 Mb (Table 1). The snailplot in Figure 2 provides a summary of the assembly statistics, while the distribution of assembly scaffolds on GC proportion and coverage is shown in Figure 3. The cumulative assembly plot in Figure 4 shows curves for subsets of scaffolds assigned to different phyla. Most (99.99%) of the assembly sequence was assigned to 10 chromosomal-level scaffolds, representing 9 autosomes and the X sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
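The curation and coverage figures above can be cross-checked with some simple arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline. It assumes that "reducing by p%" and "increasing by p%" are relative to the pre-curation values, that the reported 10 scaffolds is the post-curation count, and that the 56-fold coverage is measured against the assembly span; the function and variable names are hypothetical.

```python
# Back-of-envelope checks on the reported assembly figures.
# Assumptions (not stated in the text): percentage changes are relative to
# the pre-curation values, and coverage is relative to the assembly span.

ASSEMBLY_SPAN_BP = 292_354_128  # total assembly length (Figure 2 caption)
FINAL_SCAFFOLDS = 10            # scaffold count after curation, as reported
FINAL_N50_BP = 29_744_825       # scaffold N50 after curation (Figure 2 caption)
HIFI_COVERAGE = 56              # reported fold coverage of HiFi reads


def value_before_decrease(final_value, percent):
    """Invert final = initial * (1 - percent / 100)."""
    return final_value / (1 - percent / 100)


def value_before_increase(final_value, percent):
    """Invert final = initial * (1 + percent / 100)."""
    return final_value / (1 + percent / 100)


# Implied scaffold count before curation (a 75.00% reduction).
pre_scaffolds = value_before_decrease(FINAL_SCAFFOLDS, 75.00)

# Implied scaffold N50 before curation (a 109.22% increase).
pre_n50_bp = value_before_increase(FINAL_N50_BP, 109.22)

# Approximate HiFi yield implied by the stated coverage.
hifi_yield_bp = HIFI_COVERAGE * ASSEMBLY_SPAN_BP

print(f"Implied pre-curation scaffolds: {pre_scaffolds:.0f}")
print(f"Implied pre-curation N50: {pre_n50_bp / 1e6:.1f} Mb")
print(f"Implied HiFi yield: {hifi_yield_bp / 1e9:.1f} Gb")
```

Under these assumptions, a 109.22% increase means the scaffold N50 roughly doubled during curation, which is consistent with a small number of large joins being made.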
Metadata for specimens, barcode results, spectra estimates, sequencing runs, contaminants and pre-curation assembly statistics are given at https://links.tol.sanger.ac.uk/species/1586972.
In sample preparation, the icAgrCyan1 sample was weighed and dissected on dry ice (Jay et al., 2023). Tissue from the whole organism was homogenised using a PowerMasher II tissue disruptor (Denton et al., 2023a). HMW DNA was extracted in the WSI Scientific Operations core using the Automated MagAttract v2 protocol (Oatley et al., 2023). HMW DNA was sheared into an average fragment size of

Sequencing
Pacific Biosciences HiFi circular consensus DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific Operations core at the WSI on a Pacific Biosciences SEQUEL II instrument. Hi-C data were also generated from remaining tissue of icAgrCyan1 using the Arima2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Genome assembly, curation and evaluation
Assembly was carried out with Hifiasm (Cheng et al., 2021) and haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using YaHS (Zhou et al., 2023). The assembly was checked for contamination and corrected using the gEVAL system (Chow et al., 2016).

Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use.
The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible. The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

The assembly described is produced using cutting-edge techniques and seems to be of very high quality. The genome is reported at 292.3 Mb in span and assembled over 10 chromosomal scaffolds. N50 is 29.7 Mb, BUSCO completeness is 97.8%, QV is 64.4.
The methods section provides a thorough description of all collection and wet lab methods used, with the obvious exception of the Hi-C protocol. Because of this omission, the methods are not reproducible. Hi-C is a complex and often highly species-specific approach; more detail is needed to make this paper a useful resource for the community.
The computational methods are clear but should be more detailed: what settings were used? Again, as written they are not reproducible.
No annotation is provided for this species.
Data and assembly are available through ENA; a genome-specific accession could be included for clarity.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? No
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

Lihuén Iraí González-Dominici
Microbiology and Genetics Department, University of Salamanca, Salamanca, Spain
Dear all, I am pleased to have the opportunity to review the publication titled "The genome sequence of a metallic wood-boring beetle, Agrilus cyanescens (Ratzeburg, 1837)". This article presents the genome assembly for an individual female metallic wood-boring beetle. The genome assembly is 292.3 megabases in length and has been scaffolded into 10 chromosomal pseudomolecules, including the X sex chromosome. The mitochondrial genome assembly is also provided. This manuscript is well-written, concise and uses appropriate terminology. Relevant information on the biology, morphology and phylogeny of this species is included in the background, enhancing the reader's experience. The methodology is sufficiently clear and ensures reproducibility by other researchers. The overall quality of this publication is noteworthy, and it provides relevant information for future studies concerning A. cyanescens. However, I have a few minor suggestions.
○ Firstly, in the background, I would appreciate the inclusion of information regarding the ecological significance of this insect. It is known that wood-boring beetles play a pivotal role in many ecosystems, as they have an influence on plant-associated microbes. Do we have any insights into the specific impact of this beetle species? This could be relevant information for stimulating further research within the field of microbiology.
○ Secondly, concerning Figure 5, while the order of the chromosomes is detailed clearly at the bottom of the figure, as a reader, it would help me if they were also presented in the plot.
○ Finally, I propose that an annotation of the genome of this insect would greatly enrich the knowledge of the important role of wood-boring beetles within ecosystems and thus increase the impact of this already valuable work. Naturally, this suggestion could be considered for future research.
Is the rationale for creating the dataset(s) clearly described? Yes

Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Microbiology and Bioinformatics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
This is yet another high-quality genome assembly from the Wellcome Sanger Institute. As sequencing technologies have improved, so have genome assemblies. Progress has been truly impressive. I am not a reviewer who himself produces genome assemblies; rather, I am one of those who uses genome assemblies to help understand some aspects of insect physiology, neuropeptides in my case. So I quickly checked the various neuropeptide genes that are present in insects, and the genome seems complete. However, as a user I like genome assemblies that are not only based on PacBio sequences, even though these have also improved enormously over the years, but also include Illumina genomic sequences. The latter are really useful for checking whether sequences in the assemblies that look like they might have an indel are indeed correct. Furthermore, I really like the many-fold coverage of Illumina sequences, as it allows one to convince oneself that a particular gene is truly lacking from a species. Obviously, it is also very helpful if there is some transcriptome data, but that is just icing on the cake. This is the only somewhat more negative point I see with this assembly, although I assume that there simply was not enough DNA from the single individual.
Two minor points. I think that if I wanted to sequence a genome from a new species, I would prefer a male, as it would include the Y chromosome. I also think this type of manuscript is probably most used, although perhaps not carefully read, by people like me who use the generated data. For us it might be useful to explain a bit what the white areas in Figure 5 mean and what they imply. Does it indeed mean that those parts of the assembly are perhaps not correctly arranged?
Is the rationale for creating the dataset(s) clearly described? Yes

Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Insect neuropeptides
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Agrilus cyanescens, icAgrCyan1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 292,354,128 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (39,296,097 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (29,744,825 and 22,923,229 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the endopterygota_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icAgrCyan1_1/dataset/icAgrCyan1_1/snail.
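The N50 and N90 values and the bin size quoted in the Figure 2 caption follow standard definitions, which can be sketched as below. This is a minimal illustration rather than the BlobToolKit implementation: the `nx` helper and the toy scaffold lengths are hypothetical, and only the total span of 292,354,128 bp and the 1,000-bin layout come from the caption.

```python
def nx(lengths, x):
    """Return the Nx length for a collection of scaffold lengths.

    Nx is the length of the shortest scaffold in the smallest set of
    largest scaffolds that together cover at least x% of the total span.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length
    raise ValueError("lengths must be a non-empty collection of positive values")


# Toy scaffold lengths (hypothetical, NOT the real assembly).
toy_lengths = [40, 30, 20, 10]
toy_n50 = nx(toy_lengths, 50)
toy_n90 = nx(toy_lengths, 90)

# Span represented by each of the 1,000 snailplot bins (0.1% of the assembly).
ASSEMBLY_SPAN_BP = 292_354_128
bin_span_bp = ASSEMBLY_SPAN_BP / 1000

print(toy_n50, toy_n90, bin_span_bp)
```

With the real scaffold lengths in place of the toy list, the same function would be expected to reproduce the N50 and N90 arcs shown in the plot, assuming the plot uses this conventional definition.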
A female Agrilus cyanescens (specimen ID Ox001780, ToLID icAgrCyan1) was potted in Wytham Woods, Oxfordshire (biological vice-county Berkshire), UK (latitude 51.77, longitude -1.34) on 2021-07-08. The specimen was collected and identified by Mark Telfer (independent entomological consultant) and preserved on dry ice. Protocols developed by the Wellcome Sanger Institute (WSI) Tree of Life core laboratory have been deposited on protocols.io (Denton et al., 2023b). The workflow for high molecular weight (HMW) DNA extraction at the WSI includes a sequence of core procedures: sample preparation, sample homogenisation, DNA extraction, fragmentation, and clean-up.

Figure 3. Genome assembly of Agrilus cyanescens, icAgrCyan1.1: BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icAgrCyan1_1/dataset/icAgrCyan1_1/blob.

Figure 4. Genome assembly of Agrilus cyanescens, icAgrCyan1.1: BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/icAgrCyan1_1/dataset/icAgrCyan1_1/cumulative.

Figure 5. Genome assembly of Agrilus cyanescens, icAgrCyan1.1: Hi-C contact map of the icAgrCyan1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=fNGco5NHRcqa3R73MKwNDA.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.